average causal effect


Identification and Estimation under Multiple Versions of Treatment: Mixture-of-Experts Approach

Yoshikawa, Kohei, Kawano, Shuichi

arXiv.org Machine Learning

The Stable Unit Treatment Value Assumption (SUTVA) includes the condition that there are no multiple versions of treatment in causal inference. Because the implementation of treatment cannot be controlled in observational studies, however, multiple versions of a treatment may exist. It has been pointed out that ignoring such multiple versions can lead to biased estimates of causal effects, yet a causal inference framework that explicitly addresses the unbiased identification and estimation of version-specific causal effects has not been fully developed, which makes it difficult to gain a deeper understanding of the mechanisms of complex treatments. In this paper, we introduce the Mixture-of-Experts framework into causal inference and develop a methodology for estimating the causal effects of latent versions. This approach enables explicit estimation of version-specific causal effects even when the versions are not observed. Numerical experiments demonstrate the effectiveness of the proposed method. Keywords: causal inference; multiple versions of treatment; compound treatments; mixture-of-experts; EM algorithm. In the theory of causal inference, a fundamental starting point has been the potential outcomes framework since Rubin (1980), whose core assumption is the Stable Unit Treatment Value Assumption (SUTVA).
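The EM-based mixture-of-experts machinery the abstract names can be illustrated with a toy sketch. This is not the paper's estimator: it fits a two-component mixture of linear regressions (a mixture-of-experts with a constant gate) by EM, where the latent component plays the role of an unobserved treatment version; all data and starting values are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: two latent "versions" with different treatment-response slopes.
n = 2000
z = rng.random(n) < 0.4                       # latent version indicator (unobserved)
x = rng.normal(size=n)
y = np.where(z, 2.0 * x + 1.0, -1.0 * x) + 0.3 * rng.normal(size=n)

pi = 0.5                                      # gate probability for expert 0
beta = np.array([[0.0, 0.5], [0.0, -0.5]])    # [intercept, slope] per expert
sigma = np.array([1.0, 1.0])
X = np.column_stack([np.ones(n), x])

def normal_pdf(r, s):
    return np.exp(-0.5 * (r / s) ** 2) / (s * np.sqrt(2 * np.pi))

for _ in range(100):
    # E-step: responsibility of each expert for each observation.
    p0 = pi * normal_pdf(y - X @ beta[0], sigma[0])
    p1 = (1 - pi) * normal_pdf(y - X @ beta[1], sigma[1])
    r0 = p0 / (p0 + p1)
    # M-step: weighted least squares per expert, then update gate and scales.
    for k, w in enumerate([r0, 1 - r0]):
        W = X * w[:, None]
        beta[k] = np.linalg.solve(X.T @ W, W.T @ y)
        sigma[k] = np.sqrt(np.sum(w * (y - X @ beta[k]) ** 2) / np.sum(w))
    pi = r0.mean()

slopes = sorted(beta[:, 1])
print(round(slopes[0], 1), round(slopes[1], 1))   # near -1.0 and 2.0
```

With well-separated experts and a symmetry-breaking initialization, the EM iterations recover the two version-specific slopes and the mixing proportion (about 0.4 here).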


Causal Inference, Biomarker Discovery, Graph Neural Network, Feature Selection

Lan, Chaowang, Wu, Jingxin, Yuan, Yulong, Liu, Chuxun, Kang, Huangyi, Liu, Caihua

arXiv.org Artificial Intelligence

Biomarker discovery from high-throughput transcriptomic data is crucial for advancing precision medicine. However, existing methods often neglect gene-gene regulatory relationships and lack stability across datasets, leading to conflation of spurious correlations with genuine causal effects. To address these issues, we develop a causal graph neural network (Causal-GNN) method that integrates causal inference with multi-layer graph neural networks (GNNs). The key innovation is the incorporation of causal effect estimation for identifying stable biomarkers, coupled with a GNN-based propensity scoring mechanism that leverages cross-gene regulatory networks. Experimental results demonstrate that our method achieves consistently high predictive accuracy across four distinct datasets and four independent classifiers. Moreover, it enables the identification of more stable biomarkers compared to traditional methods. Our work provides a robust, efficient, and biologically interpretable tool for biomarker discovery, demonstrating strong potential for broad application across medical disciplines.
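The propensity-scoring backbone the abstract builds on can be sketched in a few lines. This is a generic illustration, not the paper's GNN-based propensity model: a logistic propensity score fit by gradient descent on simulated data, followed by inverse-propensity weighting (IPW) to de-bias the naive difference in means.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: one confounder x drives both treatment t and outcome y.
n = 50_000
x = rng.normal(size=n)
t = (rng.random(n) < 1 / (1 + np.exp(-x))).astype(float)
y = 2.0 * t + 1.5 * x + rng.normal(size=n)        # true causal effect = 2.0

# Fit a logistic propensity model e(x) = P(t=1 | x) by gradient descent.
w, b = 0.0, 0.0
for _ in range(500):
    e = 1 / (1 + np.exp(-(w * x + b)))
    w -= 0.5 * np.mean((e - t) * x)
    b -= 0.5 * np.mean(e - t)

e = np.clip(1 / (1 + np.exp(-(w * x + b))), 0.01, 0.99)

naive = y[t == 1].mean() - y[t == 0].mean()       # confounded
ipw = np.mean(t * y / e) - np.mean((1 - t) * y / (1 - e))
print(f"naive {naive:.2f}  IPW {ipw:.2f}")        # IPW close to 2.0
```

The naive contrast absorbs the confounding path through x, while the IPW estimate re-weights units by their propensity and lands near the true effect.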


Testing Causal Explanations: A Case Study for Understanding the Effect of Interventions on Chronic Kidney Disease

Petousis, Panayiotis, Gordon, David, Nicholas, Susanne B., Bui, Alex A. T.

arXiv.org Artificial Intelligence

Randomized controlled trials (RCTs) are the standard for evaluating the effectiveness of clinical interventions. To address the limitations of RCTs on real-world populations, we developed a methodology that uses a large observational electronic health record (EHR) dataset. Principles of regression discontinuity (RD) were used to derive randomized data subsets to test expert-driven interventions using dynamic Bayesian network (DBN) do-operations. This combined method was applied to a chronic kidney disease (CKD) cohort of more than two million individuals and used to understand the associational and causal relationships of CKD variables with respect to a surrogate outcome of >=40% decline in estimated glomerular filtration rate (eGFR). The associational and causal analyses depicted similar findings across DBNs from two independent healthcare systems. The associational analysis showed that the most influential variables were eGFR, urine albumin-to-creatinine ratio, and pulse pressure, whereas the causal analysis showed eGFR as the most influential variable, followed by modifiable factors such as medications that may impact kidney function over time. This methodology demonstrates how real-world EHR data can be used to provide population-level insights to inform improved healthcare delivery.


An R package for parametric estimation of causal effects

Anderson, Joshua Wolff, Rakovski, Cyril

arXiv.org Artificial Intelligence

Causality has been defined with the identification of the cause or causes of a phenomenon by establishing covariation of cause and effect, a time-order relationship with the cause preceding the effect, and the elimination of plausible alternative causes; see Shaughnessy et al. (2000). To claim a specific causal effect between two variables is quite a strong claim. First, there needs to be a well-defined treatment and outcome with an established covariance. Second, the treatment must precede the observed outcome. Third, there must be no other confounders present, i.e., other "treatments" that could have their own causal effect; see Judea (2010). While these conditions are not perfect parameters for inferring a causal relationship between a treatment and outcome, they help researchers remove strong bias from their studies; see Hammerton and Munafò (2021). A causal effect found in a causal inference study is almost never the true causal effect, but rather a less-biased estimate that is significantly closer to the true causal effect of the treatment on the outcome. To calculate a true causal effect would require "counterfactual" outcomes that cannot be measured; see Judea (2010). To describe a counterfactual outcome, let us define some treatment Z and an outcome Y.
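The counterfactual point above can be made concrete in simulation, where (unlike in real data) both potential outcomes exist for every unit. The sketch below, with invented data, computes the true average causal effect from the counterfactual pair and compares it with the estimate from observed data under a randomized treatment Z.

```python
import numpy as np

rng = np.random.default_rng(2)

# In simulation both potential outcomes Y(0), Y(1) exist for every unit;
# in real data only one of them is ever observed.
n = 100_000
y0 = rng.normal(size=n)
y1 = y0 + 1.0 + 0.5 * rng.normal(size=n)      # unit-level effects vary
true_ate = np.mean(y1 - y0)                   # computable only in simulation

# In observed data we see one outcome per unit; a randomized treatment Z
# makes the difference in observed means an unbiased estimate of the ATE.
z = rng.random(n) < 0.5
y_obs = np.where(z, y1, y0)
est_ate = y_obs[z].mean() - y_obs[~z].mean()
print(round(true_ate, 2), round(est_ate, 2))  # both near 1.0
```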


Estimating average causal effects from patient trajectories

Frauen, Dennis, Hatt, Tobias, Melnychuk, Valentyn, Feuerriegel, Stefan

arXiv.org Artificial Intelligence

In medical practice, treatments are selected based on the expected causal effects on patient outcomes. Here, the gold standard for estimating causal effects is randomized controlled trials; however, such trials are costly and sometimes even unethical. Instead, medical practice is increasingly interested in estimating causal effects among patient (sub)groups from electronic health records, that is, observational data. In this paper, we aim at estimating the average causal effect (ACE) from observational data (patient trajectories) that are collected over time. For this, we propose DeepACE: an end-to-end deep learning model. DeepACE leverages the iterative G-computation formula to adjust for the bias induced by time-varying confounders. Moreover, we develop a novel sequential targeting procedure which ensures that DeepACE has favorable theoretical properties, i.e., is doubly robust and asymptotically efficient. To the best of our knowledge, this is the first work that proposes an end-to-end deep learning model tailored for estimating time-varying ACEs. We compare DeepACE in an extensive number of experiments, confirming that it achieves state-of-the-art performance. We further provide a case study for patients suffering from low back pain to demonstrate that DeepACE generates important and meaningful findings for clinical practice. Our work enables practitioners to develop effective treatment recommendations based on population effects.
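The G-computation formula that DeepACE iterates over time has a simple single-time-step version, sketched below on invented data: fit an outcome model E[y | a, x], then average its predictions with the treatment set to 1 and to 0 for everyone. This is the generic idea only, not the paper's time-varying deep learning estimator.

```python
import numpy as np

rng = np.random.default_rng(3)

# Confounded observational data: x affects both treatment a and outcome y.
n = 50_000
x = rng.normal(size=n)
a = (rng.random(n) < 1 / (1 + np.exp(-2 * x))).astype(float)
y = 1.0 * a + 2.0 * x + rng.normal(size=n)    # true ACE = 1.0

# Single-time-step g-computation: fit an outcome model E[y | a, x] by OLS,
# then average its predictions under a=1 and a=0 over the whole sample.
X = np.column_stack([np.ones(n), a, x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]

X1 = np.column_stack([np.ones(n), np.ones(n), x])
X0 = np.column_stack([np.ones(n), np.zeros(n), x])
ace = np.mean(X1 @ beta - X0 @ beta)
print(round(ace, 2))                           # near 1.0
```

In the time-varying setting, the same step is applied iteratively backwards through time, which is where the bias from time-varying confounders is adjusted for.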


Estimating the average causal effect of intervention in continuous variables using machine learning

Kitazawa, Yoshiaki

arXiv.org Machine Learning

The causal effect is defined by Pearl's do operation as a probability distribution over observed data, altered from the one that originally generates the data [Pearl 1995; Pearl 2009]. When dealing with causal effects in real-world problems, it is also necessary to take into account unobserved variables that are not included in the data. In general, causal effects are counterfactual probability distributions that differ from the data-generating systems in the real world. When unobserved variables are present, the question becomes whether the causal effect can be determined from the available observed data; that is, we need to consider the identifiability of causal effects. This problem has recently been resolved to a certain extent [Tian and Pearl 2002; Shpitser and Pearl 2006; Shpitser and Pearl 2012].
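A fully observed special case makes the do operation concrete: with a single discrete confounder z, the backdoor formula gives P(y | do(x)) = sum_z P(y | x, z) P(z), which generally differs from the observational conditional P(y | x). The numbers below are an invented toy distribution.

```python
import numpy as np

# Joint distribution over binary z (confounder), x (cause), y (effect).
p_z = np.array([0.6, 0.4])
p_x_given_z = np.array([[0.8, 0.2],     # P(x | z=0)
                        [0.3, 0.7]])    # P(x | z=1)
p_y_given_xz = np.zeros((2, 2, 2))      # indexed [z, x, y]
p_y_given_xz[0] = [[0.9, 0.1], [0.5, 0.5]]
p_y_given_xz[1] = [[0.6, 0.4], [0.2, 0.8]]

# Interventional distribution: cut the z -> x edge, keep P(z) and P(y | x, z).
def p_y_do_x(x):
    return sum(p_z[z] * p_y_given_xz[z, x, 1] for z in range(2))

# Observational conditional P(y=1 | x), which mixes in the confounding path.
def p_y_given_x(x):
    num = sum(p_z[z] * p_x_given_z[z, x] * p_y_given_xz[z, x, 1] for z in range(2))
    den = sum(p_z[z] * p_x_given_z[z, x] for z in range(2))
    return num / den

ace = p_y_do_x(1) - p_y_do_x(0)
obs = p_y_given_x(1) - p_y_given_x(0)
print(round(ace, 3), round(obs, 3))      # 0.4 vs 0.55
```

The gap between 0.4 and 0.55 is exactly the confounding that the do operation removes; identifiability asks when such an adjustment exists even with some variables unobserved.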


Causal inference with imperfect instrumental variables

Miklin, Nikolai, Gachechiladze, Mariami, Moreno, George, Chaves, Rafael

arXiv.org Machine Learning

Instrumental variables allow for quantification of cause and effect relationships even in the absence of interventions. To achieve this, a number of causal assumptions must be met, the most important of which is the independence assumption, which states that the instrument and any confounding factor must be independent. However, if this independence condition is not met, can we still work with imperfect instrumental variables? Imperfect instruments can manifest themselves by violations of the instrumental inequalities that constrain the set of correlations in the scenario. In this paper, we establish a quantitative relationship between such violations of instrumental inequalities and the minimal amount of measurement dependence required to explain them. As a result, we provide adapted inequalities that are valid in the presence of a relaxed measurement dependence assumption in the instrumental scenario. This allows for the adaptation of existing and new lower bounds on the average causal effect for instrumental scenarios with binary outcomes. Finally, we discuss our findings in the context of quantum mechanics.
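A minimal sketch of the instrumental-variable idea, on invented data with a valid (perfect) instrument: the Wald estimator divides the instrument's effect on the outcome by its effect on the treatment, recovering the causal effect that the naive contrast gets wrong. The imperfect-instrument analysis in the abstract concerns what survives when the independence assumption below is relaxed.

```python
import numpy as np

rng = np.random.default_rng(4)

# Confounder u affects both treatment t and outcome y; instrument w
# affects t only and is independent of u (the independence assumption).
n = 200_000
u = rng.normal(size=n)
w = (rng.random(n) < 0.5).astype(float)
t = (rng.random(n) < 1 / (1 + np.exp(-(2 * w + u - 1)))).astype(float)
y = 1.5 * t + 2.0 * u + rng.normal(size=n)    # true causal effect = 1.5

# Wald estimator: instrument's effect on y divided by its effect on t.
wald = (y[w == 1].mean() - y[w == 0].mean()) / (t[w == 1].mean() - t[w == 0].mean())
naive = y[t == 1].mean() - y[t == 0].mean()   # confounded by u
print(f"naive {naive:.2f}  Wald {wald:.2f}")
```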


Instance-wise Causal Feature Selection for Model Interpretation

Panda, Pranoy, Kancheti, Sai Srinivas, Balasubramanian, Vineeth N

arXiv.org Artificial Intelligence

We formulate a causal extension to the recently introduced paradigm of instance-wise feature selection to explain black-box visual classifiers. Our method selects a subset of input features that has the greatest causal effect on the model's output. We quantify the causal influence of a subset of features by the Relative Entropy Distance measure. Under certain assumptions, this is equivalent to the conditional mutual information between the selected subset and the output variable. The resulting causal selections are sparser and cover salient objects in the scene. We show the efficacy of our approach on multiple vision datasets by measuring the post-hoc accuracy and Average Causal Effect of selected features on the model's output.


Minimax Kernel Machine Learning for a Class of Doubly Robust Functionals

Ghassami, AmirEmad, Ying, Andrew, Shpitser, Ilya, Tchetgen, Eric Tchetgen

arXiv.org Machine Learning

A moment function is called doubly robust if it comprises two nuisance functions and the estimator based on it is a consistent estimator of the target parameter even if one of the nuisance functions is misspecified. In this paper, we consider a class of doubly robust moment functions originally introduced in (Robins et al., 2008). We demonstrate that this moment function can be used to construct estimating equations for the nuisance functions. The main idea is to choose each nuisance function such that it minimizes the dependency of the expected value of the moment function on the other nuisance function. We implement this idea as a minimax optimization problem. We then provide conditions required for asymptotic linearity of the estimator of the parameter of interest, which are based on the convergence rate of the product of the errors of the nuisance functions, as well as the local ill-posedness of a conditional expectation operator. The convergence rates of the nuisance functions are analyzed using modern techniques from statistical learning theory based on the Rademacher complexity of the function spaces. We specifically focus on the case in which the function spaces are reproducing kernel Hilbert spaces, which enables us to use their spectral properties to analyze the convergence rates. As an application of the proposed methodology, we consider the parameter of average causal effect both in the presence and absence of latent confounders. When latent confounders are present, we use the recently proposed proximal causal inference framework of (Miao et al., 2018; Tchetgen Tchetgen et al., 2020), and hence our results lead to a robust non-parametric estimator for the average causal effect in this framework.
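The double-robustness property is easy to demonstrate with the classic AIPW moment for the average causal effect (a standard member of this class, not the paper's minimax procedure). In the invented simulation below, the outcome model is deliberately misspecified while the propensity score is correct, and the estimator remains consistent.

```python
import numpy as np

rng = np.random.default_rng(5)

# Simulated confounded data; true average causal effect is 1.0.
n = 100_000
x = rng.normal(size=n)
e_true = 1 / (1 + np.exp(-x))                 # true propensity score
a = (rng.random(n) < e_true).astype(float)
y = 1.0 * a + np.sin(2 * x) + 2 * x + rng.normal(size=n)

# Nuisance 1: a deliberately crude outcome model, linear in x only
# (it misses the sin term) -> misspecified.
X = np.column_stack([np.ones(n), a, x])
b = np.linalg.lstsq(X, y, rcond=None)[0]
m1 = b[0] + b[1] + b[2] * x                   # predicted y under a=1
m0 = b[0] + b[2] * x                          # predicted y under a=0

# Nuisance 2: the true propensity score -> correctly specified.
e = e_true

# AIPW (doubly robust) moment: consistent because one nuisance is correct.
aipw = np.mean(m1 - m0
               + a * (y - m1) / e
               - (1 - a) * (y - m0) / (1 - e))
print(round(aipw, 2))                          # near 1.0
```

Symmetrically, a correct outcome model would rescue a misspecified propensity score; the paper's contribution is how to estimate both nuisances via a minimax criterion rather than by plug-in modeling.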


On the Non-Monotonicity of a Non-Differentially Mismeasured Binary Confounder

Peña, Jose M.

arXiv.org Machine Learning

Suppose that we are interested in the average causal effect of a binary treatment on an outcome when this relationship is confounded by a binary confounder. Suppose that the confounder is unobserved but a non-differential binary proxy of it is observed. We identify conditions under which adjusting for the proxy comes closer to the incomputable true average causal effect than not adjusting at all. Unlike other works, we do not assume that the average causal effect of the confounder on the outcome is in the same direction among treated and untreated.
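A simulation with invented parameters illustrates the abstract's setting: a binary confounder u is unobserved, a non-differential proxy w (10% misclassification, independent of treatment and outcome given u) is observed, and adjusting for w lands closer to the true effect than not adjusting, without removing the bias entirely.

```python
import numpy as np

rng = np.random.default_rng(6)

# Binary confounder u is unobserved; w is a non-differential noisy proxy.
n = 400_000
u = (rng.random(n) < 0.5).astype(int)
w = np.where(rng.random(n) < 0.9, u, 1 - u)   # 10% misclassification
t = (rng.random(n) < 0.2 + 0.6 * u).astype(int)
y = 1.0 * t + 2.0 * u + rng.normal(size=n)    # true ACE = 1.0

# No adjustment: plain difference in means.
naive = y[t == 1].mean() - y[t == 0].mean()

# Standardize over the proxy w in place of the unobserved u.
adj = sum(np.mean(w == v) *
          (y[(t == 1) & (w == v)].mean() - y[(t == 0) & (w == v)].mean())
          for v in (0, 1))
print(f"naive {naive:.2f}  proxy-adjusted {adj:.2f}")
```

Here the naive contrast is about 2.2 and the proxy-adjusted one about 1.6 against a true effect of 1.0: adjustment helps but residual confounding remains, which is the regime the paper's conditions characterize.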